Establishing an XML metadata knowledge base to assist integration of structured and semi-structured databases

Authors

  • Fahad M. Al-Wasil
  • W. Alex Gray
  • N. J. Fiddian
Abstract

This paper describes the establishment of an XML Metadata Knowledge Base (XMKB) to assist integration of distributed heterogeneous structured data residing in relational databases and semi-structured data held in well-formed XML documents (XML documents that conform to the XML syntax rules but have no referenced DTD or XML schema) produced by Internet applications. We propose an approach to combine and query the data sources through a mediation layer. This layer is intended to establish and evolve an XMKB incrementally, so that the Query Processor can mediate between user queries posed over the master view and the distributed heterogeneous data sources. The XMKB is built in a bottom-up fashion by incrementally extracting and merging the metadata of the data sources. It maintains the data source information (names, types and locations), meta-information about relationships of paths among data sources, and function names for handling semantic and structural discrepancies. A System to Integrate Structured and Semi-structured Databases (SISSD) has been built that generates a tool for a meta-user (who does the metadata integration) to describe mappings between the master view and local data sources by assigning index numbers and specifying conversion function names. This system is flexible: users can get any master view from the same set of data sources depending on their interest. It also preserves the autonomy of the local data sources. The SISSD uses the local-as-view approach to map between the master view and the local schema structures. This approach is well suited to supporting a dynamic environment, where data sources can be added to or removed from the system without the need to restructure the master view or to regenerate the XMKB from scratch.
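As an illustration only, the role the abstract gives the XMKB (source names, types and locations, plus path relationships and conversion function names) can be sketched in Python. The element and attribute names (`xmkb`, `source`, `map`, `master`, `local`, `function`), the helper functions, and the sample sources are all hypothetical; the abstract does not specify the actual XMKB layout or how index numbers are assigned, so this sketch maps paths directly instead of using index numbers.

```python
import xml.etree.ElementTree as ET


def new_xmkb():
    """Create an empty knowledge base (hypothetical root element name)."""
    return ET.Element("xmkb")


def add_source(xmkb, name, kind, location, mappings):
    """Register one local source, described in terms of the master view.

    mappings: iterable of (master_path, local_path, function_name_or_None).
    """
    src = ET.SubElement(xmkb, "source", name=name, type=kind, location=location)
    for master_path, local_path, func in mappings:
        attrs = {"master": master_path, "local": local_path}
        if func:
            # Name of a conversion function that resolves a discrepancy.
            attrs["function"] = func
        ET.SubElement(src, "map", attrs)
    return src


def remove_source(xmkb, name):
    """Drop a source; entries for the other sources are left untouched."""
    for src in xmkb.findall("source"):
        if src.get("name") == name:
            xmkb.remove(src)


def lookup(xmkb, master_path):
    """Return (source, local_path, function) for every source that maps the path."""
    hits = []
    for src in xmkb.findall("source"):
        for m in src.findall("map"):
            if m.get("master") == master_path:
                hits.append((src.get("name"), m.get("local"), m.get("function")))
    return hits


# Hypothetical sources: one relational table and one well-formed XML document.
kb = new_xmkb()
add_source(kb, "hr_db", "relational", "db.example/hr", [
    ("/staff/person/name", "employee.full_name", None),
    ("/staff/person/salary", "employee.salary_cents", "cents_to_units"),
])
add_source(kb, "staff_feed", "xml", "staff.xml", [
    ("/staff/person/name", "/people/p/n", None),
])
print(ET.tostring(kb, encoding="unicode"))
print(lookup(kb, "/staff/person/name"))
```

Because each source carries its own mappings to the master view, adding or removing a source in this sketch touches only that source's entries, which is the property the abstract attributes to the local-as-view approach.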


Similar resources

Facilitating Integration of Distributed Statistical Databases Using Metadata and XML

In a distributed statistical database environment, data selection and actual statistical computation can be carried out at local databases, while data integration can bring together data produced from diverse sources, at different levels of details to produce statistical summaries. For example, the amount of beef retailed in European countries may be held in different local consumer databases a...


Integrating Xml with Relational Databases Using Middleware Approach

Over the past few years, XML has become the undisputable lingua franca standard both for semi-structured data representation and exchange format over the Internet, and also content management in various e-business worlds, especially the B2B and B2C enterprise applications. However, most of these organisations still rely heavily on existing relational database management systems (RDBMS) to store...


An Approach for Synergically Carrying out Intensional and Extensional Integration of Data Sources Having Different Formats

In this paper we propose a data source integration approach capable of uniformly handling different source formats, ranging from databases to XML documents and other semi-structured data. The proposed approach consists of two components, performing intensional and extensional integration, respectively; these are strictly coupled, since they use related models for representing intensional and ex...


Une approche matérialisée basée sur les vues pour l'intégration de documents XML. (A view-based approach to the integration of structured and semi-structured data)

Semi-structured data play an increasing role in the development of the Web through the use of XML. However, the management of semi-structured data poses specific problems because semi-structured data, contrary to classical databases, do not rely on a predefined schema. The schema of a document is contained in the document itself and similar documents may be represented by different sc...


Querying Semi-structured Data with Mutual Exclusion

Data analytics applications, content-based collaborative platforms and office applications require the integration and management of current and historical data from heterogeneous sources. XML is a standard data format for information. Thanks to its semi-structured-ness, it is a good candidate data model for the integration and management of heterogeneous content. However, the management of his...




Publication date: 2006